fix(bio-research): download FASTQ files over HTTPS instead of HTTP #173

Open
tejas-dharani wants to merge 1 commit into anthropics:main from tejas-dharani:fix/bio-research-fastq-https

@tejas-dharani

Problem

ncbi_utils.py builds FASTQ download URLs using plain HTTP:

# Line 343
urls = [f"http://{url}" for url in ftp_urls.split(';') if url]

ENA returns FTP paths like ftp.sra.ebi.ac.uk/vol1/fastq/..., which get converted to http://ftp.sra.ebi.ac.uk/.... These are multi-GB genomic files downloaded with no transport encryption, so their content can be modified in transit without the client noticing.

The ENA API query itself is correctly over HTTPS (line 314), but the actual file downloads are not.

Fix

Change http:// to https:// on line 343. ENA supports HTTPS downloads on the same paths.
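As a rough illustration of the one-line change, the URL construction could be isolated in a small helper like the sketch below. The function name `build_fastq_urls` and its `scheme` parameter are hypothetical and do not appear in ncbi_utils.py; the sketch also strips whitespace around each path, which the original list comprehension does not do.

```python
from __future__ import annotations


def build_fastq_urls(ftp_urls: str, scheme: str = "https") -> list[str]:
    """Turn ENA's semicolon-separated, scheme-less FASTQ paths into full URLs.

    Hypothetical helper for illustration; not part of ncbi_utils.py.
    """
    urls = []
    for path in ftp_urls.split(";"):
        path = path.strip()
        if path:  # skip empty segments, e.g. from a trailing ';'
            urls.append(f"{scheme}://{path}")
    return urls
```

Called with a string of the form "host/path_1.fastq.gz;host/path_2.fastq.gz", this returns one https:// URL per non-empty segment, matching the behavior of the patched line 343.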

Changes

  • bio-research/skills/nextflow-development/scripts/utils/ncbi_utils.py line 343: http:// → https://

Testing

ENA HTTPS endpoint confirmed accessible. No behavior change beyond encrypted transport.
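The PR does not say how accessibility was checked. One way to repeat such a check is a HEAD request, which avoids downloading a multi-GB file. The sketch below uses only the Python standard library; `make_head_request` and `is_reachable` are hypothetical names, and the URL to probe is left to the caller.

```python
import urllib.request


def make_head_request(url: str) -> urllib.request.Request:
    """Build a HEAD request, refusing anything that is not HTTPS."""
    if not url.startswith("https://"):
        raise ValueError(f"expected an https:// URL, got: {url}")
    return urllib.request.Request(url, method="HEAD")


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if the server answers the HEAD request with a 2xx status."""
    try:
        with urllib.request.urlopen(make_head_request(url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers URLError/HTTPError, TLS failures, and socket timeouts.
        return False
```

Passing one of the URLs produced for a real accession to `is_reachable` would show whether the HTTPS path responds; a plain http:// URL is rejected with a ValueError before any network traffic occurs.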

Closes #166

ENA FTP paths were converted to plain HTTP URLs, meaning multi-GB
genomic downloads had no transport-layer encryption. ENA supports
HTTPS on the same paths. Changed http:// to https:// on line 343.

Fixes anthropics#166
tejas-dharani force-pushed the fix/bio-research-fastq-https branch from c28aee5 to 9dcf05c on April 8, 2026 11:42
Development

Successfully merging this pull request may close these issues.

[Security] SSRF + path traversal chain in bio-research ncbi_utils.py and sra_geo_fetch.py